I.  INTRODUCTION


This  Note  is  the  user’s  manual  for  the  Longitudinal  Scalogram  Analysis  (LSA) 
program.  LSA  is  an  extension  of  cross-sectional  scalogram  analysis  to  longitudinal  data. 
An  application  of  this  program  using  Project  ALERT  data  is  provided  in  Ellickson,  Hays, 
and  Bell  (forthcoming). 

II.  UNDERSTANDING  LONGITUDINAL  SCALOGRAM  ANALYSIS


Unitary  growth  characterizes  a  variety  of  different  developmental  processes  including 
intellectual  development  (e.g.,  Bayley,  1955),  drug  use  involvement  (e.g.,  Kandel,  1975), 
moral  development  (Walker,  deVries,  and  Bichard,  1984),  and  functional  health  (Stewart, 
Ware,  and  Brook,  1981).  The  common  feature  of  these  different  domains  is  a  deterministic, 
cumulative sequence of development. Cross-sectional Guttman scale analysis has been
employed  as  the  “bread  and  butter”  method  for  evaluating  these  processes  (Guttman,  1944). 
The  Guttman  scale  model  is  straightforward  and  easy  to  interpret.  If  observed  data  fit  a 
Guttman  scale,  then  all  persons  with  the  same  scale  score  (i.e.,  sum  of  endorsed  items  in  the 
scale)  have  identical  responses  to  each  item  in  the  scale.  In  general,  the  number  of  possible 
response  patterns  is  two  raised  to  a  power  equal  to  the  number  of  items,  but  the  number  of 
response  patterns  consistent  with  a  Guttman  scale  equals  the  number  of  items  plus  one 
(Dotson  and  Summers,  1970;  Schwartz,  1986). 
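
The counting argument above can be checked with a short sketch. It is illustrative only and is not part of the LSA program; the function name `guttman_patterns` is hypothetical, and items are assumed to be ordered from most to least prevalent.

```python
from itertools import product


def guttman_patterns(n_items):
    """Return the response patterns consistent with a Guttman scale.

    Assumes items are ordered from most to least prevalent, so a
    consistent pattern is a run of 1s followed by a run of 0s.
    """
    return [(1,) * score + (0,) * (n_items - score)
            for score in range(n_items + 1)]


n_items = 3
all_patterns = list(product((0, 1), repeat=n_items))   # 2 ** n_items patterns
consistent = guttman_patterns(n_items)                  # n_items + 1 patterns

print(len(all_patterns), "possible patterns;", len(consistent), "are consistent")
for pattern in consistent:
    print(pattern, "total score =", sum(pattern))
```

For three items this lists the four consistent patterns out of the eight possible ones.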

Table  1  presents  the  item  response  patterns  expected  for  three  items  forming  a 
Guttman  scale  of  measurement:  magnitude,  equal  interval,  and  absolute  zero.  Eight 
response  patterns  are  possible,  but  only  the  four  shown  in  Table  1  are  consistent  with  a 
Guttman  scale.  Knowing  that  a  scale  has  an  absolute  zero  point  allows  for  the  inference  that 
it  has  equal  intervals  and  that  it  has  the  property  of  magnitude. 

Similarly,  knowing  that  a  scale  has  equal  intervals  leads  to  the  prediction  that  it 
possesses the property of magnitude. In contrast, knowing that a scale has magnitude does
not  allow  one  to  infer  whether  or  not  it  has  equal  intervals  or  an  absolute  zero  point. 

Table 1

EXAMPLE OF PATTERN OF RESPONSES TO THREE ITEMS
FITTING PERFECTLY A CROSS-SECTIONAL
GUTTMAN SCALE

Type of Scale   Magnitude?   Equal Interval?   Absolute Zero?   Total Score
Nominal         No           No                No               0
Ordinal         Yes          No                No               1
Interval        Yes          Yes               No               2
Ratio           Yes          Yes               Yes              3

The scalability of responses is determined by comparing observed patterns of data
with  the  patterns  predicted  for  a  Guttman  scale,  examining  the  degree  to  which  observed 
response  patterns  deviate  from  expected  response  patterns.  The  coefficient  of 
reproducibility  (CR)  for  Guttman  scales  is  defined  as  the  proportion  of  error  (i.e.,  proportion 
of  differences  between  observed  and  expected  responses)  subtracted  from  unity.  A  CR 
value  of  0.90  or  higher  is  considered  acceptable.  In  addition,  an  index  of  reproducibility  is 
typically  computed  by  determining  how  well  item  modes  reproduce  the  observed  response 
patterns.  Errors  are  counted  as  differences  between  each  observed  item  response  for  an 
individual  and  the  modal  response  for  that  item  across  all  respondents  using  the  Goodenough 
(1944) procedure. This index, the minimum marginal reproducibility (MR), is used to
calculate  the  coefficient  of  scalability  (CS)  defined  as  (CR  -  MR)/(1  -  MR).  A  CS  of  0.60 
has  been  recommended  as  a  minimum  standard  for  acceptability  (Menzel,  1953). 
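
A minimal sketch of these three coefficients follows, assuming Goodenough scoring as described above (the expected pattern is fixed by the total score and every mismatching item counts as one error). The function name and the six-respondent data set are illustrative assumptions, not LSA.EXE code.

```python
def cross_sectional_coefficients(responses):
    """Return (CR, MR, CS) for 0/1 responses, one row per respondent.

    Columns are assumed to be in the hypothesized order, most prevalent
    item first. Undefined (division by zero) if MR equals 1.
    """
    n, k = len(responses), len(responses[0])

    # Errors relative to the Guttman pattern implied by each total score.
    scale_errors = 0
    for row in responses:
        score = sum(row)
        expected = [1] * score + [0] * (k - score)
        scale_errors += sum(o != e for o, e in zip(row, expected))
    cr = 1 - scale_errors / (n * k)

    # Errors made when every response is predicted from the item's mode.
    modal_errors = 0
    for j in range(k):
        passed = sum(row[j] for row in responses)
        modal_errors += min(passed, n - passed)
    mr = 1 - modal_errors / (n * k)

    cs = (cr - mr) / (1 - mr)
    return cr, mr, cs


# Hypothetical data: four perfect Guttman patterns and two non-scale patterns.
data = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1), (0, 1, 1)]
cr, mr, cs = cross_sectional_coefficients(data)
print(f"CR = {cr:.4f}, MR = {mr:.4f}, CS = {cs:.4f}")
```

With these six respondents, both CR (about 0.78) and CS (0.50) fall below the thresholds quoted above.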

Traditional  Guttman  scalogram  analysis  is  limited  to  evaluating  item  order  cross- 
sectionally.  Longitudinal  scalogram  analysis  (LSA)  is  an  extension  of  traditional  scalogram 
analysis  that  incorporates  the  element  of  time  (Hays  and  Ellickson,  1990).  Table  2  presents 
response  patterns  for  three  items  measured  at  three  time  points.  As  illustrated  in  Table  2, 
only  one  pattern  of  responses  is  longitudinally  consistent  with  a  total  score  of  0, 1,  8,  or  9 
“yes”  answers.  However,  there  are  two  different  response  patterns  consistent  with  two  or 
seven  affirmative  answers  and  three  different  response  patterns  consistent  with  a  total  of 
three,  four,  five,  or  six  affirmative  answers.  For  example,  a  total  score  of  2  may  be  obtained 
for  a  scale  having  the  property  of  magnitude  at  time  2  and  time  3  or  by  a  scale  having 
magnitude  and  equal  interval  properties  at  time  3  (assuming  that  scales  can  change  over 
time).  In  general,  the  number  of  possible  response  patterns  is  two  raised  to  a  power  equal  to 
the  product  of  the  number  of  items  and  waves.  The  number  of  patterns  consistent  with  a 
longitudinal  Guttman  scale  is: 

    (items + waves)! / (items! waves!)

Because  of  the  multiple  response  patterns  consistent  with  a  longitudinal  Guttman  scale, 
calculating  reproducibility  and  scalability  is  not  as  straightforward  for  longitudinal  as  it  is  for 
cross-sectional  data. 
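
The pattern count can be verified by enumeration. The sketch below assumes that a longitudinally consistent pattern is one in which each wave is itself a Guttman pattern and the number of items passed never decreases from one wave to the next; the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import comb


def longitudinal_patterns(n_items, n_waves):
    """Enumerate patterns consistent with a longitudinal Guttman scale.

    Waves are listed in chronological order. Each wave is a run of 1s
    followed by 0s, and the per-wave score never drops over time.
    """
    patterns = []
    # combinations_with_replacement yields non-decreasing score sequences.
    for scores in combinations_with_replacement(range(n_items + 1), n_waves):
        pattern = tuple(
            bit
            for score in scores
            for bit in (1,) * score + (0,) * (n_items - score)
        )
        patterns.append(pattern)
    return patterns


patterns = longitudinal_patterns(3, 3)
by_total = Counter(sum(p) for p in patterns)

print(len(patterns), "consistent patterns out of", 2 ** (3 * 3))
print("formula:", comb(3 + 3, 3))
print("patterns per total score:", dict(sorted(by_total.items())))
```

For three items and three waves the enumeration and the formula agree, and the counts per total score match the one, two, or three patterns described in the text.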




Table 2

EXAMPLE OF PATTERN OF RESPONSES TO THREE ITEMS
FITTING PERFECTLY A THREE-WAVE LONGITUDINAL
GUTTMAN SCALE

A1    B1    C1    A2    B2    C2    A3    B3    C3    Total Score
No    No    No    No    No    No    No    No    No    0
No    No    No    No    No    No    Yes   No    No    1
No    No    No    Yes   No    No    Yes   No    No    2
No    No    No    No    No    No    Yes   Yes   No    2
No    No    No    Yes   No    No    Yes   Yes   No    3
No    No    No    No    No    No    Yes   Yes   Yes   3
Yes   No    No    Yes   No    No    Yes   No    No    3
No    No    No    Yes   Yes   No    Yes   Yes   No    4
No    No    No    Yes   No    No    Yes   Yes   Yes   4
Yes   No    No    Yes   No    No    Yes   Yes   No    4
No    No    No    Yes   Yes   No    Yes   Yes   Yes   5
Yes   No    No    Yes   Yes   No    Yes   Yes   No    5
Yes   No    No    Yes   No    No    Yes   Yes   Yes   5
No    No    No    Yes   Yes   Yes   Yes   Yes   Yes   6
Yes   Yes   No    Yes   Yes   No    Yes   Yes   No    6
Yes   No    No    Yes   Yes   No    Yes   Yes   Yes   6
Yes   No    No    Yes   Yes   Yes   Yes   Yes   Yes   7
Yes   Yes   No    Yes   Yes   No    Yes   Yes   Yes   7
Yes   Yes   No    Yes   Yes   Yes   Yes   Yes   Yes   8
Yes   Yes   Yes   Yes   Yes   Yes   Yes   Yes   Yes   9

NOTE: A = magnitude, B = equal interval, C = absolute zero.


With  longitudinal  data,  the  expected  pattern  against  which  observed  scores  are 
compared cannot be determined solely on the basis of the total score across items. However,
identification  of  all  longitudinal  patterns  that  are  consistent  with  the  Guttman  model  and 
yield  the  total  score  observed  for  each  individual  can  be  used  to  select  the  pattern  (i.e., 
“expected  pattern”)  that  is  minimally  different  from  observed  scores.  Table  3  provides  an 
example  of  selecting  the  expected  pattern  for  a  total  score  of  5  and  observed  score  pattern  of 
001  111  100  for  three  items  measured  at  three  time  points.  The  minimum  difference 
between the observed pattern and the three patterns consistent with a longitudinal Guttman
scale  and  yielding  the  same  total  score  is  4.  This  difference  is  observed  for  two  of  the  three 
patterns;  thus,  either  of  these  patterns  can  serve  as  the  expected  pattern  (i.e.,  they  are 
equivalent  for  the  purpose  of  computing  scalogram  errors).* 
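
The selection of the expected pattern can be sketched as follows for the observed pattern 001 111 100. This is an illustration of the procedure described above under the assumption that differences are counted item by item; the helper names are hypothetical and the code is not taken from LSA.EXE.

```python
from itertools import combinations_with_replacement


def longitudinal_patterns(n_items, n_waves):
    """Patterns where each wave is a Guttman pattern and scores never drop."""
    return [
        tuple(bit for s in scores for bit in (1,) * s + (0,) * (n_items - s))
        for scores in combinations_with_replacement(range(n_items + 1), n_waves)
    ]


def differences(a, b):
    """Number of item responses on which two patterns disagree."""
    return sum(x != y for x, y in zip(a, b))


observed = (0, 0, 1,  1, 1, 1,  1, 0, 0)   # waves 1, 2, 3 in order

# Standard method: only consistent patterns with the same total score compete.
candidates = [p for p in longitudinal_patterns(3, 3) if sum(p) == sum(observed)]
distance = {p: differences(observed, p) for p in candidates}
fewest = min(distance.values())
expected = [p for p, d in distance.items() if d == fewest]

for p, d in distance.items():
    print(p, "differences =", d)
print("scalogram errors for this respondent:", fewest)
```

Two candidates tie at four differences, so either can serve as the expected pattern and the respondent contributes four errors.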

Once  the  expected  pattern  has  been  determined,  longitudinal  coefficients  of 
reproducibility  (LCR)  and  scalability  (LCS)  can  be  computed  as  in  cross-sectional  Guttman 
scalogram  analysis.  Subtracting  the  proportion  of  errors  from  unity  yields  LCR.  LCS  is 
defined  as  the  difference  between  LCR  and  the  reproducibility  of  items  from  their  modes 
(LMR),  divided  by  LMR  subtracted  from  unity:  LCS  =  (LCR  -  LMR)/(1  -  LMR).^ 

Previous research using Guttman scalogram analysis has not reported estimates of
sampling error for the coefficient of reproducibility. Green (1956) noted that the standard
error of the CR can be approximated by sqrt[CR (1 - CR)/(NK)], an adaptation of the formula
for the standard error of a proportion (N = number of respondents, K = number of items).


Table 3

COMPARING EXAMPLE PATTERN TO PATTERNS CONSISTENT
WITH A LONGITUDINAL GUTTMAN SCALE: THREE ITEMS,
THREE WAVES, AND A TOTAL SCORE OF 5

Time 1   Time 2   Time 3   Difference
Item     Item     Item     Between
1 2 3    1 2 3    1 2 3    Patterns     Type of Response Pattern
0 0 1    1 1 1    1 0 0    -            Example pattern
0 0 0    1 1 0    1 1 1    4            Longitudinally consistent pattern #1
1 0 0    1 1 0    1 1 0    4            Longitudinally consistent pattern #2
1 0 0    1 0 0    1 1 1    6            Longitudinally consistent pattern #3

NOTE: 0 = not passed, 1 = passed.


*As  an  alternative  to  narrowing  down  the  potential  expected  patterns  based  on  the 
total  score,  one  can  compare  each  score  with  all  longitudinally  consistent  patterns  to  identify 
the  pattern  that  is  least  different.  This  alternative  procedure  yields  scaling  coefficients 
(reproducibility  and  scalability)  that  are  as  large  or  larger  than  those  obtained  from  the 
standard  method.  However,  we  observed  a  tenfold  increase  in  execution  time  using  this 
alternative  procedure  on  Project  ALERT  data  (Ellickson,  Hays,  and  Bell,  forthcoming). 

^The LSA error-counting procedure is directly analogous to cross-sectional Guttman
scalogram analysis and weights different sequencing inconsistencies equally. An argument
could be made for differential weighting of errors (e.g., endorsing an item out of sequence at
wave 1 might be considered worse than endorsing the same item at a later wave).

Although this formula provides a reasonable approximation for the original Cornell method
of calculating reproducibility, it requires modification for use with the “double-counting”
Goodenough (1944) scoring method. The following formula is more appropriate for
estimating the standard error of reproducibility for Goodenough scoring:
sqrt[(1 + CR) (1 - CR)/(NK)]. The Longitudinal Scalogram Analysis program computes
approximate standard errors using this latter formula, and it also calculates the actual
standard errors, using the fact that the coefficient for a sample is the average of the
coefficients for the members of the sample.
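
Both standard errors can be sketched in a few lines. The estimated value uses the Goodenough formula given above. The "actual" value is computed here as the sample standard deviation of the per-respondent coefficients divided by the square root of N; that reading is an assumption, chosen because it reproduces the figures in the example output (Table 6). The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def estimated_se(cr, n_respondents, n_items):
    """Approximate standard error of CR for Goodenough scoring."""
    return sqrt((1 + cr) * (1 - cr) / (n_respondents * n_items))


def actual_se(errors_per_person, n_items):
    """Return (CR, SE), treating CR as the mean of per-person coefficients.

    Assumption: SE = sample standard deviation (n - 1 denominator)
    divided by sqrt(N).
    """
    per_person = [1 - e / n_items for e in errors_per_person]
    return mean(per_person), stdev(per_person) / sqrt(len(per_person))


# Error distribution from the example output: six respondents with 0 errors,
# one with 2 errors, and two with 4 errors, over K = 6 item responses each.
errors = [0] * 6 + [2] + [4, 4]
lcr, se_actual = actual_se(errors, n_items=6)
se_estimated = estimated_se(lcr, n_respondents=len(errors), n_items=6)

print(f"LCR = {lcr:.4f}")
print(f"estimated SE = {se_estimated:.4f}")
print(f"actual SE    = {se_actual:.4f}")
```

The two estimates differ (0.0789 versus 0.0980 here) because the formula depends only on the overall coefficient, whereas the actual value reflects how unevenly errors are spread across respondents.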

III.  USING  THE  LONGITUDINAL  SCALOGRAM  ANALYSIS  PROGRAM


The Longitudinal Scalogram Analysis program, LSA.EXE, is a compiled BASIC
program that runs under DOS 2.0 or later on IBM PC or compatible microcomputers.
LSA.EXE outputs the proportion of the sample passing each item, the number of
respondents in the analysis, a frequency distribution of the number of scaling errors, and the
longitudinal coefficients of reproducibility and scalability, LCR and LCS. Cross-sectional
coefficients of reproducibility and scalability are provided for each wave of data. In
addition, the universe of response patterns perfectly consistent with a longitudinal Guttman
scale for the given number of items and waves is printed, sorted by the number of endorsed
items. LSA.EXE is limited to four waves (time points) of data and nine items per wave (if
four waves of data are analyzed). A sample size of up to 4,500 cases can be analyzed (the
frequency of all response patterns is available only for sample sizes of 1,250 or less).

To run the LSA.EXE program, the user needs a raw data (ASCII) input file. Table 4
provides an example raw data file, RAW (the default file name), consisting of 11
respondents, two waves of data, and three items at each wave. This raw data file has been
constructed so that more recent data-collection waves precede earlier waves. In the example,
wave 2 data appear first, followed by wave 1 data. However, the user can arrange the data
in any order desired. Items in the analysis are coded as either “0” (item not endorsed) or “1”
(item endorsed). If any of the input items has a value other than “0” or “1,” LSA.EXE
excludes the case from the analysis.


Table 4

EXAMPLE RAW FILE

1100000
1000000
1000000
1100100
1110110
1111111
1101101
1011011
1011011
0000000
0111101

Program input specifications are supplied in a second file, as shown in the example
input specification file in Table 5. This input specification file, INPUT (the default file
name), consists of eight keywords: TITLE, NCASES, WAVES, SELECT, HOWREAD,
ITEMS, LCSMAX, and FREQUENCY. The TITLE keyword is followed by a one-line
descriptive title. Following the NCASES keyword, the user specifies the number of
respondents in the RAW input file. The number of data waves is indicated after the
WAVES keyword. The SELECT keyword is optional and is used only when one wants to
select a subset of the RAW cases for analysis. If the SELECT option is used, the line
following the SELECT keyword is used to designate the value of the selection variable.

The  HOWREAD  keyword  appears  next.  Following  the  HOWREAD  keyword  is  the 
full  input  specification  (FORTRAN-type  input  format),  including  the  SELECT  variable  (if 
applicable) and analysis variables. If the SELECT keyword is not used (and therefore the
whole sample is used), then the input specification following the HOWREAD keyword
includes only items in the analysis.


Table 5

EXAMPLE INPUT FILE

TITLE
Sample Data File of 11 Cases
NCASES
11
WAVES
2
SELECT
1
HOWREAD
(7I1)
ITEMS
6
'LOW2' 1
'MED2' 2
'HIGH2' 3
'LOW1' 4
'MED1' 5
'HIGH1' 6
LCSMAX
Yes
FREQUENCY
Yes
END

The ITEMS keyword is listed next, followed by the number of items in the analysis
(number  of  items  at  each  wave  times  the  number  of  waves).  The  item  names  are  listed  on 
consecutive  lines  corresponding  to  the  HOWREAD  input  specification.  On  each  line 
following  the  item  name  is  a  rank  order  number.  The  numbers  following  the  item  names 
collectively  inform  LSA.EXE  about  the  hypothesized  structure  in  the  data.  LSA.EXE  uses 
these  numbers  to  order  the  items  for  analysis.  The  number  adjacent  to  the  first  item  name 
indicates  where  the  first  item  in  the  sequence  at  the  most  recent  wave  is  located  among  all 
items  in  the  analysis.  In  the  example  input  file  in  Table  5,  the  number  “1”  is  shown  next  to 
the  LOW2  item,  “2”  next  to  the  MED2  item,  and  so  forth.  The  “1”  tells  LSA.EXE  that  the 
first  item  in  the  sequence  (at  the  most  recent  wave)  is  ordered  first  in  the  list  of  items.  Thus, 
the first item for this example is LOW2. Similarly, the “2” informs the program that the
second item in the sequence is ordered second in the list of items; therefore, the second
item in the sequence is MED2. The third number in the column of numbers designates the
location of the third item at the most recent wave, and so on. Once the items for the most
recent wave are completed, the corresponding items at earlier waves are designated. If
MED2 were hypothesized as the first item in the sequence at the most recent wave and
LOW2 as the second item, then this section of the INPUT file would be changed as follows:

ITEMS
6
'LOW2' 2
'MED2' 1
'HIGH2' 3
'LOW1' 5
'MED1' 4
'HIGH1' 6

The  LCSMAX  keyword  indicates  whether  or  not  the  user  wants  the  program  to 
compute  longitudinal  coefficients  of  scalability  by  comparing  each  score  with  all 
longitudinally  consistent  patterns  to  identify  the  pattern  that  is  least  different.  As  noted 
above,  this  alternative  procedure  is  more  computationally  intensive  than  the  standard 
method.  If  these  additional  coefficients  are  desired,  the  LCSMAX  keyword  needs  to  be 
followed  by  a  line  with  the  word  “Yes”  (upper  or  lowercase  is  acceptable).  Otherwise  this 
line  should  contain  the  word  “No.”  Similarly,  the  FREQUENCY  keyword  is  used  if  a 
frequency  distribution  of  responses  is  desired. 
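
For users who prefer to generate the specification file from a script, the keyword layout of Table 5 can be assembled as sketched below. This is a hypothetical helper, not part of the LSA distribution: the function name and arguments are assumptions, the closing END line is copied from Table 5, and whether LSA.EXE accepts this exact spacing and quoting has not been verified.

```python
def build_input_spec(title, n_cases, n_waves, read_format, items,
                     select=None, lcsmax=False, frequency=False):
    """Return the text of an INPUT specification file.

    `items` is a list of (name, rank_order) pairs in HOWREAD order.
    `select` is the value of the selection variable, or None to analyze
    the whole sample (the SELECT keyword is then omitted).
    """
    lines = ["TITLE", title, "NCASES", str(n_cases), "WAVES", str(n_waves)]
    if select is not None:
        lines += ["SELECT", str(select)]
    lines += ["HOWREAD", read_format, "ITEMS", str(len(items))]
    lines += [f"'{name}' {rank}" for name, rank in items]
    lines += ["LCSMAX", "Yes" if lcsmax else "No"]
    lines += ["FREQUENCY", "Yes" if frequency else "No"]
    lines.append("END")  # present in Table 5; assumed to terminate the file
    return "\n".join(lines) + "\n"


items = [("LOW2", 1), ("MED2", 2), ("HIGH2", 3),
         ("LOW1", 4), ("MED1", 5), ("HIGH1", 6)]
spec = build_input_spec("Sample Data File of 11 Cases", 11, 2, "(7I1)",
                        items, select=1, lcsmax=True, frequency=True)
print(spec)
```

Writing `spec` to a file named INPUT in the working directory would then mirror the example; PRELSA.EXE, described below, remains the supported way to create this file.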
After RAW and INPUT have been created, execution is initiated by typing “GO” and
touching  the  Enter  (Return)  key.  The  GO  command  activates  a  batch  file  that  calls  three 
subprograms.  The  first,  LS.EXE,  reads  the  input  specification  file  (e.g.,  INPUT)  and  the 
input  raw  data  file  (e.g.,  RAW)  and  writes  out  a  new  file,  OUTPUT,  that  integrates  the  two 
input  files.  Next,  the  main  subprogram,  LLL.EXE,  executes  and  writes  out  the  primary 
scalogram  output  to  one  file  and  the  universe  of  perfect  longitudinal  patterns  for  the  given 
number  of  items  and  waves  to  a  separate  file.  Finally,  the  last  subprogram,  LL.EXE, 
computes  the  frequencies  of  response  patterns,  if  frequencies  were  requested  using  the 
FREQUENCY  keyword.  The  batch  file  integrates  the  output  of  the  subprograms  together 
into  one  file,  OUTPUT.  This  output  file  can  be  printed  using  the  DOS  “print”  command. 

The OUTPUT file produced by the example RAW and INPUT files is given in Table
6. Note that nine of the 11 respondents were selected for the analysis on the basis of the
selection criteria.
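
The case selection and the item proportions in the example output can be reproduced with a short sketch. It assumes the (7I1) layout of the example, with the selection variable in column 1 and the six items in columns 2 to 7, and it reads the fifth row of the example RAW file as 1110110 (that row is garbled in some copies of Table 4). The function name is hypothetical.

```python
RAW = """\
1100000
1000000
1000000
1100100
1110110
1111111
1101101
1011011
1011011
0000000
0111101
"""

ITEM_NAMES = ["LOW2", "MED2", "HIGH2", "LOW1", "MED1", "HIGH1"]


def read_cases(text, select_value="1"):
    """Return item responses for cases matching the selection value.

    Cases with any item value other than 0 or 1 are dropped, mirroring
    the exclusion rule described for LSA.EXE.
    """
    cases = []
    for line in text.split():
        select, items = line[0], line[1:7]
        if select != select_value:
            continue
        if len(items) != 6 or any(ch not in "01" for ch in items):
            continue
        cases.append([int(ch) for ch in items])
    return cases


cases = read_cases(RAW)
proportions = {
    name: sum(case[j] for case in cases) / len(cases)
    for j, name in enumerate(ITEM_NAMES)
}

print("NUMBER OF SUBJECTS =", len(cases))
for name, p in proportions.items():
    print(f"{p:.2f}  {name}")
```

Nine cases are retained, and the proportions passing agree with the ITEM PROPORTION PASSING section of Table 6.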

Included  on  the  distribution  diskette  is  a  program,  PRELSA.EXE,  that  can  be  used  to 
create  the  input  specification  file  for  LSA.EXE.  PRELSA.EXE  was  written  as  a  user- 
friendly  device  for  those  who  prefer  answering  structured  questions  rather  than  creating  the 
input  specification  file  directly. 

The  user  runs  PRELSA.EXE  by  typing  “PRELSA”  and  touching  the  Enter  (Return) 
key. PRELSA.EXE then asks a series of questions and uses the responses to create an input
specification  file,  INPUT.  (Warning:  If  a  file  is  saved  on  the  default  drive  with  the  name 
“INPUT,”  it  will  be  overwritten  when  PRELSA  is  executed.)  PRELSA.EXE  seeks  eight 
pieces  of  information:  the  title  for  the  analysis,  the  number  of  cases  in  the  raw  data  file,  the 
number  of  waves  of  data,  whether  or  not  a  subsample  analysis  will  be  done,  the  number  of 
items  in  the  analysis,  the  selection  variable  and  its  column  location  (if  applicable),  item 
names,  column  locations  and  rank  ordering  of  items,  whether  or  not  errors  are  to  be 
calculated  using  the  intensive  computation  method,  and  whether  or  not  the  frequency  of 
response  patterns  will  be  printed.  The  text  of  these  inquiries  is  provided  in  Table  7. 

Table 6

EXAMPLE OUTPUT FILE

LONGITUDINAL SCALOGRAM ANALYSIS (LSA) PROGRAM (VERSION 2.1)
BY R. D. HAYS
RAND

Sample Data File of 11 Cases

ITEM PROPORTION PASSING

Wave = 2
  1   0.56   LOW2
  2   0.44   MED2
  3   0.44   HIGH2

Wave = 1
  1   0.44   LOW1
  2   0.44   MED1
  3   0.44   HIGH1

NUMBER OF SUBJECTS = 9

LONGITUDINAL SCALOGRAM ANALYSIS
                                                   95% Confidence Interval
COEFFICIENT OF REPRODUCIBILITY (MAX)  = 0.8889
COEFFICIENT OF SCALABILITY (MAX)      = 0.7500

COEFFICIENT OF REPRODUCIBILITY (LCR)  = 0.8148     (0.6188 - 1.0108)
ESTIMATED STANDARD ERROR OF LCR       = 0.0789
ACTUAL STANDARD ERROR OF LCR          = 0.0980
MINIMUM MARGINAL REPRODUCIBILITY      = 0.5556
PERCENT IMPROVEMENT                   = 0.2593
COEFFICIENT OF SCALABILITY            = 0.5833
PROPORTION PERFECT GUTTMAN PATTERNS   = 0.6667

CROSS-SECTIONAL SCALOGRAM ANALYSIS

COEFFICIENT OF REPRODUCIBILITY WAVE 2   0.7778     (0.5556 - 1.0000)
ESTIMATED STANDARD ERROR OF LR          0.1210
ACTUAL STANDARD ERROR OF LR             0.1111
MINIMUM MARGINAL REPRODUCIBILITY        0.5556
PERCENT IMPROVEMENT                     0.2222
COEFFICIENT OF SCALABILITY              0.5000

COEFFICIENT OF REPRODUCIBILITY WAVE 1   0.7778     (0.5556 - 1.0000)
ESTIMATED STANDARD ERROR OF LR          0.1210
ACTUAL STANDARD ERROR OF LR             0.1111
MINIMUM MARGINAL REPRODUCIBILITY        0.5556
PERCENT IMPROVEMENT                     0.2222
COEFFICIENT OF SCALABILITY              0.5000

Table 6 (continued)

FREQUENCY OF SCALING ERRORS

0   ******   (  6)
2   *        (  1)
4   **       (  2)

FREQUENCIES FOR ALL RESPONSE PATTERNS:

Pattern      Frequency
W2   W1

000  000  :  (  2)
000  000  :  (  2)
000  000  :  (  2)
011  011  :  (  2)
100  000  :  (  1)
100  100  :  (  1)
101  101  :  (  1)
110  110  :  (  1)
111  111  :  (  1)

PERFECT LONGITUDINAL PATTERNS FOR GIVEN NUMBER OF ITEMS AND WAVES

N PASSED   SEQUENCE   PATTERN
0          1          000 000
1          2          100 000
2          3          100 100
2          4          100 100
2          5          110 000
3          6          110 100
3          7          110 100
3          8          111 000
4          9          110 110
4          10         110 110
4          11         110 110
4          12         111 100
4          13         111 100
5          14         111 110
5          15         111 110
5          16         111 110
6          17         111 111
6          18         111 111
6          19         111 111
6          20         111 111

Table 7

DIALOG OF PRELSA.EXE

1.  WHAT IS THE TITLE FOR THIS ANALYSIS?
    (TYPE 80 ALPHANUMERIC COLUMNS OR LESS)

2.  HOW MANY CASES ARE THERE IN THE RAW DATA FILE?

3.  HOW MANY WAVES OF DATA ARE THERE?

4.  IS THIS A SUBSAMPLE ANALYSIS?
    THAT IS, ARE YOU SUBSETTING THE SAMPLE?
    1 = YES
    2 = NO

4B. WHAT VALUE ARE YOU SELECTING ON?
    (VALUE OF THE SELECTION VARIABLE USED TO SELECT THE SUBSAMPLE)

5.  HOW MANY ITEMS ARE IN THE ANALYSIS?
    (NUMBER OF ITEMS AT EACH WAVE X NUMBER OF WAVES)

5B. SELECTION VARIABLE:
                 BEGINS IN   ENDS IN
    NAME         COLUMN      COLUMN

6.  PLEASE TYPE THE ITEM NAME, COLUMN LOCATION (IN RAW DATA FILE),
    AND RANK ORDER OF EACH ITEM IN THE ANALYSIS.
    RANK ORDER 1 IS THE ITEM HYPOTHESIZED TO BE MOST PREVALENT AT THE MOST
    RECENT WAVE. RANK ORDER 2 IS THE ITEM HYPOTHESIZED TO BE SECOND MOST
    PREVALENT AT THE MOST RECENT WAVE.

    ITEM NAME    COLUMN NUMBER    RANK ORDER
    ITEM: ?      ?                ?

7.  DO YOU WANT TO CALCULATE ERRORS USING THE INTENSIVE COMPUTATION
    METHOD?
    1 = YES
    2 = NO

8.  DO YOU WANT A PRINTOUT OF THE FREQUENCY OF RESPONSE PATTERNS?
    1 = YES
    2 = NO

IV.  APPLYING  LONGITUDINAL  SCALOGRAM  ANALYSIS


Kandel  and  Faust  (1975)  provided  cross-tabulations  of  drug  use  stages  at  the  end  of 
the  senior  year  by  use  reported  during  a  subsequent  five  to  nine  month  time  interval  for  872 
public  secondary  school  students.  Applying  the  LSA  methodology  to  these  data  allows  an 
evaluation  of  the  hypothesis  that  cumulative  drug  use  reported  at  the  end  of  high  school 
continues  as  current  use  during  a  time  span  immediately  following  high  school. 

About  95  percent  of  the  sample  reported  drug  use  that  was  cross-sectionally 
consistent  (i.e.,  had  no  errors)  at  both  time  points  with  a  seven-level  Guttman  scale:  nonuse, 
use  of  legal  drugs,  cannabis,  pills,  psychedelics,  cocaine,  and  heroin.  The  LSA  analysis  was 
restricted  to  these  respondents  (n  =  791),  because  complete  information  about  response 
patterns  was  not  discernible  in  the  original  article  for  the  rest  of  the  sample.  The  data  for 
this  subsample  (see  Table  8)  support  the  hypothesized  longitudinal  Guttman  scale,  although 
there  were  some  relapses  (i.e.,  items  not  passed  at  time  2  that  were  passed  at  time  1)  and 
these are reflected in the less-than-perfect longitudinal scalogram coefficients (LCR = 0.97,
LCS  =  0.72).  Cross-sectional  Guttman  scale  analysis  of  the  two  waves  of  data  is  insensitive 
to  these  relapses  (i.e.,  CS  =  1.0  at  both  time  points),  because  it  ignores  the  dimension  of 
time. 

Examination of the longitudinal scaling errors reveals that the majority involve two
types of respondents: (1) persons who reported having tried legal drugs but abstained after
high school, and (2) persons who reported having tried legal drugs and cannabis but
abstained from cannabis after high school.

In  the  special  case  where  no  longitudinal  transitions  occur  (i.e.,  the  cross-sectional 
hierarchy  among  items  contains  all  the  information,  as  in  the  example  shown  in  Table  9),  the 
LCS  index  is  not  simply  the  average  of  the  cross-sectional  scalability  coefficients.  In 
general,  the  LCS  value  will  exceed  the  average  of  the  CS  values  because  longitudinal  data 
offer  greater  flexibility  in  identifying  target  response  patterns  that  minimize  scalability 
errors.  For  example,  LCS  =  0.62  for  the  data  shown  in  Table  9  while  CS  =  0.50  for  both 
waves  of  data. 
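
The LCS value for the Table 9 data can be reproduced with the sketch below. It is an illustrative reimplementation of the procedure described in Section II under stated assumptions (equal weighting of errors, expected pattern chosen as the closest consistent pattern with the same total score); the function names are hypothetical and this is not the LSA.EXE source. Setting `match_total=False` corresponds to the alternative procedure that compares each case with all longitudinally consistent patterns.

```python
from itertools import combinations_with_replacement


def consistent_patterns(n_items, n_waves):
    """Patterns where each wave is a Guttman pattern and scores never drop."""
    return [
        tuple(bit for s in scores for bit in (1,) * s + (0,) * (n_items - s))
        for scores in combinations_with_replacement(range(n_items + 1), n_waves)
    ]


def lsa_coefficients(cases, n_items, n_waves, match_total=True):
    """Return (LCR, LMR, LCS); each case lists waves in chronological order."""
    universe = consistent_patterns(n_items, n_waves)
    n_cells = len(cases) * n_items * n_waves

    errors = 0
    for case in cases:
        pool = universe
        if match_total:
            pool = [p for p in universe if sum(p) == sum(case)]
        errors += min(sum(o != e for o, e in zip(case, p)) for p in pool)
    lcr = 1 - errors / n_cells

    # Reproducibility of the items from their modes.
    modal_errors = 0
    for j in range(n_items * n_waves):
        passed = sum(case[j] for case in cases)
        modal_errors += min(passed, len(cases) - passed)
    lmr = 1 - modal_errors / n_cells

    return lcr, lmr, (lcr - lmr) / (1 - lmr)


# Table 9: low, medium, high at time 1, then low, medium, high at time 2.
table9 = [
    (0, 0, 0, 0, 0, 0),
    (1, 0, 0, 1, 0, 0),
    (1, 1, 0, 1, 1, 0),
    (1, 1, 1, 1, 1, 1),
    (1, 0, 1, 1, 0, 1),
    (0, 1, 1, 0, 1, 1),
]
lcr, lmr, lcs = lsa_coefficients(table9, n_items=3, n_waves=2)
print(f"LCR = {lcr:.4f}, LMR = {lmr:.4f}, LCS = {lcs:.3f}")
```

The sketch returns LCS = 0.625, in line with the 0.62 quoted above, whereas each wave analyzed separately gives CS = 0.50.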

Table  10  provides  a  clear  example  of  why  the  same  prevalence  rates  can  result  in 
different  longitudinal  scaling  results.  Hypothetical  data  that  would  lead  to  opposite 
conclusions  about  a  hypothesized  sequence  of  drug  use  (from  alcohol  use  to  marijuana  use  to 
hard  drug  use)  are  shown.  Note  that  in  both  panels  of  Table  10  the  prevalence  of  alcohol  use 

Table 8

RESPONSE PATTERNS FOR 791 RESPONDENTS
FROM KANDEL AND FAUST (1975)

Time 1     Time 2
Item       Item
123456     123456     Frequency
000000     000000      36
000000     100000      22
000000     110000       3
000000     111111       1
100000     000000      33*
100000     100000     345
100000     110000      76
100000     111000       5
110000     100000      35*
110000     110000     106
110000     111000      13
110000     111100       5
110000     111110       2
111000     100000       8*
111000     110000      12*
111000     111000      20
111000     111100       5
111000     111110       2
111000     111111       2
111100     100000       8*
111100     110000      13*
111100     111000      10*
111100     111100       8
111100     111110       9
111110     110000       2*
111110     111000       3*
111110     111100       3*
111110     100000       1*
111110     110000       1*
111110     111000       2*

NOTE: Total n = 791. 0 = not passed, 1 = passed. Items
are legal drugs, cannabis, pills, psychedelics, cocaine, and heroin.
Asterisks denote longitudinal relapses (i.e., items failed at time 2,
but passed at time 1).

Table 9

SUBSTANTIVE EXAMPLE ILLUSTRATING AN ABSENCE
OF LONGITUDINAL TRANSITIONS

        Time 1                   Time 2
Low    Medium    High    Low    Medium    High
0      0         0       0      0         0
1      0         0       1      0         0
1      1         0       1      1         0
1      1         1       1      1         1
1      0         1       1      0         1
0      1         1       0      1         1

NOTE: Time #1 = entry into kindergarten. Time #2 = beginning of
first grade. Three levels of achievement are defined: low, medium, high.


is  0.50  and  0.70  at  time  1  and  time  2,  respectively;  the  prevalence  of  marijuana  use  is  0.20 
and  0.30,  respectively;  and  the  prevalence  of  hard  drug  use  is  0.10  and  0.20,  respectively. 
However,  LCS  =  1.00  for  panel  A  and  0.38  for  panel  B.  Hence  the  data  in  panel  A  provide 
strong  support  for  the  hypothesized  sequence  of  drug  use  involvement  whereas  the  data  in 
panel  B  do  not. 
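
That the two panels of Table 10 share the same prevalence rates can be confirmed directly. The sketch below only tabulates prevalence; the rows are transcribed from Table 10, and the variable and function names are illustrative.

```python
# Each string is one respondent: Alc, Mar, Hard at time 1, then at time 2.
PANEL_A = ["000000", "000000", "000000", "000100", "000100",
           "100100", "100100", "100110", "110111", "111111"]
PANEL_B = ["010010", "010010", "000010", "001101", "000101",
           "100100", "100100", "100100", "100100", "100100"]
LABELS = ["Alc T1", "Mar T1", "Hard T1", "Alc T2", "Mar T2", "Hard T2"]


def prevalence(rows):
    """Proportion of respondents endorsing each item, in column order."""
    return [sum(int(r[j]) for r in rows) / len(rows)
            for j in range(len(rows[0]))]


prev_a = prevalence(PANEL_A)
prev_b = prevalence(PANEL_B)
for label, a, b in zip(LABELS, prev_a, prev_b):
    print(f"{label:8s} panel A = {a:.2f}   panel B = {b:.2f}")
```

The marginal rates are identical, so the difference in LCS between the panels arises entirely from how responses combine within respondents.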

Table 10

SUBSTANTIVE EXAMPLE ILLUSTRATING HOW SAME
PREVALENCE RATES CAN LEAD TO DIFFERENT
LONGITUDINAL SCALOGRAM RESULTS

                  Time 1                Time 2
             Alc    Mar    Hard    Alc    Mar    Hard
Panel A:
             0      0      0       0      0      0
             0      0      0       0      0      0
             0      0      0       0      0      0
             0      0      0       1      0      0
             0      0      0       1      0      0
             1      0      0       1      0      0
             1      0      0       1      0      0
             1      0      0       1      1      0
             1      1      0       1      1      1
             1      1      1       1      1      1
Panel B:
             0      1      0       0      1      0
             0      1      0       0      1      0
             0      0      0       0      1      0
             0      0      1       1      0      1
             0      0      0       1      0      1
             1      0      0       1      0      0
             1      0      0       1      0      0
             1      0      0       1      0      0
             1      0      0       1      0      0
             1      0      0       1      0      0

NOTE: Time #1 = 7th grade. Time #2 = 8th grade. Three levels of
drug use are defined: Alc = alcohol, Mar = marijuana, Hard = hard drugs.

V.  AVAILABILITY


Copies  of  LSA.EXE  on  a  floppy  diskette  may  be  obtained  from  Wm.  C.  Brown 
Publishers, 2460 Kerper Blvd., Dubuque, IA 52001; phone (319) 588-1451. Employees of
RAND  may  contact  the  author  directly.  Questions  about  the  program  should  be  directed  to: 
Ron  D.  Hays,  Ph.D.,  Social  Policy  Department,  RAND,  1700  Main  Street,  Santa  Monica, 
CA  90407-2138. 




VI.  A  SUMMARY  OF  ALTERNATIVE  ANALYTIC  METHODS 


Collins,  Cliff,  and  Dent  (1988)  were  the  first  to  extend  cross-sectional  Guttman 
scaling  by  incorporating  the  element  of  time.  They  developed  the  Longitudinal  Guttman 
Simplex  (LGS)  method,  which  considers  four  kinds  of  relations  of  items  and  times. 
Redundant  time  relations  are  those  in  which  answers  given  to  a  pair  of  items  provide 
redundant information about two time points (i.e., at one time point both items are failed and
at  the  other  time  point  both  are  passed).  Redundant  item  relations  are  those  in  which  the 
answers  to  a  pair  of  items  match  at  two  time  points.  Unique  relations  are  those  in  which 
responses  to  only  one  item  in  a  pair  change  over  time.  Contradictory  relations  provide 
conflicting  information  about  the  relative  ordering  of  both  items  and  times. 

Collins, Cliff, and Dent (1988) derived a consistency index, CL, that ranges from
negative infinity to positive one (cf. Cliff, 1979). The weighting scheme used to compute
CL was empirically derived based on the ability to distinguish random from nonrandom data
and to distinguish among data known to differ in consistency (Collins, Cliff, and Dent,
1988). Unique relations are weighted four times as heavily as redundant and contradictory
relations. The total number of weighted consistent relations is computed as the sum of the
number of redundant relations and four times the number of unique relations that are
congruent with the a priori item-times order. The proportion of consistent relations is equal
to the total number of weighted consistent relations divided by the total number of weighted
relations (cf. Collins, Cliff, and Dent, 1988). Rules of thumb for the CL index have been
suggested, but consensus guidelines for interpreting this coefficient have not yet been
developed by the research community.

The  LGS  method,  the  Longitudinal  Scalogram  Analysis  methodology  described  in 
this  manual,  and  traditional  Guttman  scalogram  analysis  all  ignore  measurement  error  and 
are  deterministic  in  the  sense  that  they  evaluate  the  extent  to  which  all  individuals  adhere  to 
the  same  basic  response  model.  Latent  structure  analysis,  a  probabilistic  analytic  procedure, 
offers  greater  flexibility  in  modeling  observed  response  patterns.  For  example,  Proctor 
(1970)  proposed  a  latent  structure  model  that  explicitly  allows  for  response  error.  The 
Proctor model assumes that each scale item has the same error rate. Clogg and Sawyer
(1981) presented an even more general model, allowing for specific item error rates and
different error rates for different types of respondents. The Proctor model and the Clogg and
Sawyer procedures are examples of latent class models. Further information about latent
class analysis generally (McCutcheon, 1987) and specific applications to adolescent drug use
(Graham et al., 1991; Sorenson and Brownfield, 1989) are provided elsewhere.
Item-response  theory  is  another  form  of  latent  structure  analysis  in  which  the  distribution  of 
the  latent  trait  is  assumed  to  be  continuous  (Hambleton  and  Swaminathan,  1985;  Traub  and 
Lam,  1985). 

Mixed-Markov  modeling  is  potentially  one  of  the  most  promising  approaches  for 
modeling  stage  transitions.  An  excellent  introduction  to  Mixed-Markov  models  is  given  by 
Uebersax,  et  al.  (1990). 